NeuCrowd: neural sampling network for representation learning with crowdsourced labels

Abstract

Representation learning approaches require a massive amount of discriminative training data, which is unavailable in many scenarios, such as healthcare, smart city, and education. In practice, people turn to crowdsourcing to get annotated labels. However, due to issues like data privacy, budget limitations, and a shortage of domain-specific annotators, the number of crowdsourced labels is still very limited. Moreover, because of annotators' diverse expertise, crowdsourced labels are often inconsistent. Thus, directly applying existing supervised representation learning (SRL) algorithms can easily lead to overfitting and yield suboptimal solutions. In this paper, we propose NeuCrowd, a unified framework for SRL from crowdsourced labels. The proposed framework (1) creates a sufficient number of high-quality n-tuplet training samples by utilizing safety-aware sampling and robust anchor generation; and (2) automatically learns a neural sampling network that adaptively learns to select effective samples for SRL networks. The proposed framework is evaluated on one synthetic and three real-world data sets. The results show that our approach outperforms a wide range of state-of-the-art baselines in terms of prediction accuracy and AUC. To encourage reproducible results, we make our code publicly available at https://github.com/tal-ai/NeuCrowd_KAIS2021 .
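The abstract names n-tuplet sampling from inconsistent crowdsourced labels but does not describe the procedure. The sketch below is only an illustration of the general idea, not the authors' method: it assumes binary labels, uses the fraction of annotators agreeing with the majority vote as a stand-in "safety" score, and draws an (anchor, positive, negatives) tuple with probability proportional to that score. The function names `agreement` and `sample_ntuplet` are hypothetical.

```python
import random
from collections import Counter


def agreement(votes):
    """Return (majority_label, fraction of annotators who chose it).

    Assumption: this agreement ratio is an illustrative proxy for how
    "safe" an item's label is; it is not taken from the paper.
    Ties are broken by first-seen label, so use an odd number of votes.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def sample_ntuplet(votes_per_item, n_negatives, rng):
    """Draw item indices (anchor, positive, negatives) for one n-tuplet.

    Items whose annotators agree more are sampled more often.
    Negatives are drawn with replacement to keep the sketch short.
    """
    scored = [agreement(v) for v in votes_per_item]
    pos = [i for i, (lab, _) in enumerate(scored) if lab == 1]
    neg = [i for i, (lab, _) in enumerate(scored) if lab == 0]
    if len(pos) < 2 or not neg:
        raise ValueError("need at least two positives and one negative")

    anchor = rng.choices(pos, weights=[scored[i][1] for i in pos], k=1)[0]
    rest = [i for i in pos if i != anchor]
    positive = rng.choices(rest, weights=[scored[i][1] for i in rest], k=1)[0]
    negatives = rng.choices(neg, weights=[scored[i][1] for i in neg], k=n_negatives)
    return anchor, positive, negatives


# Each inner list holds the votes of three annotators for one item.
votes = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 1], [1, 0, 0]]
anchor, positive, negatives = sample_ntuplet(votes, n_negatives=2, rng=random.Random(0))
```

In a full pipeline the sampled indices would select feature vectors to feed an n-tuplet loss; the released repository linked above is the place to look for the actual procedure.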

Related resources

Learning Attributes from the Crowdsourced Relative Labels

Finding semantic attributes to describe related concepts is typically a hard problem. The commonly used attributes in most fields are designed by domain experts, which is expensive and time-consuming. In this paper we propose an efficient method to learn human comprehensible attributes with crowdsourcing. We first design an analogical interface to collect relative labels from the crowds. Then w...

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

Neural Simpletrons - Minimalistic Probabilistic Networks for Learning With Few Labels

Classifiers for the semi-supervised setting often combine strong supervised models with additional learning objectives to make use of unlabeled data. This results in powerful though very complex models that are hard to train and that demand additional labels for optimal parameter tuning, which are often not given when labeled data is very sparse. We here study a minimalistic multi-layer generat...

Learning a Neural-network-based Representation for Open Set Recognition

Open set recognition problems exist in many domains. For example, in security, new malware classes emerge regularly; therefore malware classification systems need to identify instances from unknown classes in addition to discriminating between known classes. In this paper we present a neural network based representation for addressing the open set recognition problem. In this representation insta...

Journal

Journal title: Knowledge and Information Systems

Year: 2022

ISSN: 0219-3116, 0219-1377

DOI: https://doi.org/10.1007/s10115-021-01644-7